On Improved Example-based Search in Digital Libraries via Term Ranking
Abstract
Example-based searching, where a user provides an example publication in order to locate similar publications, is becoming commonplace in literature digital libraries. Two approaches to estimating similarity between publications are (i) graph-based approaches, where citation relationships among publications are used to compute similarities, and (ii) text-based approaches, where shared terms between publications are taken as an indicator of similarity. In this paper we introduce a new text-based technique for measuring publication similarity that enhances existing example-based searching by utilizing term importance information. Term importance is computed via a proposed graph-based term ranking (GBTR) algorithm. The GBTR algorithm differs from previous term ranking approaches in that it recursively computes term importance from the entire publication in which the term is observed, rather than relying only on local, term-specific information. GBTR works well when paired with Okapi BM25. We exhaustively evaluate the performance of GBTR and compare it against existing term-ranking methods such as the Chronological Term Rank (CTR) and the Term Proximity (TP) models. Significant improvements in precision over existing approaches are observed. GBTR achieved around 10% improvement in precision over CTR and around 2% over TP, with much lower computational time and space complexity than the TP approach.
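The abstract does not say how GBTR term importance is combined with Okapi BM25, so the following is only a minimal sketch of the standard BM25 scoring function with a hypothetical per-term multiplier. The function name `bm25_scores`, the `term_weight` argument, the multiplicative combination, and the toy documents are all assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, term_weight=None, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`.

    `term_weight` is a hypothetical mapping from term to importance
    (for example, the output of a term-ranking step). Terms missing
    from the mapping get weight 1.0, which gives plain Okapi BM25.
    """
    term_weight = term_weight or {}
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(t for d in docs for t in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in set(query_terms):
            if tf[t] == 0:
                continue
            # Smoothed IDF variant that stays positive for frequent terms.
            idf = math.log((n_docs - doc_freq[t] + 0.5) / (doc_freq[t] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[t] + k1 * (1 - b + b * len(d) / avg_len)
            # Assumed combination: scale the BM25 contribution by the term weight.
            score += term_weight.get(t, 1.0) * idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus: the "example publication" is represented by its query terms.
docs = [
    "graph based term ranking for digital libraries".split(),
    "citation graph similarity between publications".split(),
    "cooking recipes for pasta".split(),
]
query = "term ranking digital libraries".split()
scores = bm25_scores(query, docs)
```

Passing a mapping such as `term_weight={"ranking": 2.0}` raises the contribution of that term only, which is one simple way a term-importance score could influence the ranking.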
Similar resources
Using Interactive Search Elements in Digital Libraries
Background and Aim: Interaction in a digital library helps users locate and access information, and also assists them in creating knowledge, improving perception, solving problems, and recognizing the dimensions of resources. This paper tries to identify and introduce the components and elements that are used in interaction between user and system in search and retrieval of information in digital li...
A Plugin Architecture Enabling Federated Search for Digital Libraries
Today, users expect a variety of digital libraries to be searchable from a single Web page. The German Vascoda project provides this service for dozens of information sources. Its ultimate goal is to provide search quality close to the ranking of a central database containing documents from all participating libraries. Currently, however, the Vascoda portal is based on a non-cooperative metasea...
A Comparing between the impacts of text based indexing and folksonomy on ranking of images search via Google search engine
Background and Aim: The purpose of this study was to compare the impact of text-based indexing and folksonomy on image retrieval via the Google search engine. Methods: This study used an experimental method. The sample consisted of 30 images extracted from the book “Gray anatomy”. The research was carried out in 4 stages; in the first stage, images were uploaded to an “Instagram” account so the images are tagge...
Reducing semantic complexity in distributed digital libraries
Purpose – The general science portal “vascoda” merges structured, high-quality information collections from more than 40 providers on the basis of search engine technology (FAST) and a concept which treats semantic heterogeneity between different controlled vocabularies. First experiences with the portal show some weaknesses of this approach which come out in most metadata-driven Digital Libr...
Federated Search of Text-Based Digital Libraries in Hierarchical Peer-to-Peer Networks
Peer-to-peer architectures are a potentially powerful model for developing large-scale networks of text-based digital libraries, but peer-to-peer networks have so far provided very limited support for text-based federated search of digital libraries using relevance-based ranking. This paper addresses the problems of resource representation, resource ranking and selection, and result merging for ...